Beyond the Twilight Zone: automated prediction of structural properties of proteins by recursive neural networks and remote homology information.

Authors

  • Catherine Mooney
  • Gianluca Pollastri
Abstract

The prediction of 1D structural properties of proteins is an important step toward the prediction of protein structure and function, not only in the ab initio case but also when homology information to known structures is available. Despite this, the vast majority of 1D predictors do not incorporate homology information into the prediction process. We develop a novel structural alignment method, SAMD, which we use to build alignments of putative remote homologues that we compress into templates of structural frequency profiles. We use these templates as additional input to ensembles of recursive neural networks, which we specialise for the prediction of query sequences that show only remote homology to any Protein Data Bank structure. We predict four 1D structural properties: secondary structure, relative solvent accessibility, backbone structural motifs, and contact density. Secondary structure prediction accuracy, tested by five-fold cross-validation on a large set of proteins allowing less than 25% sequence identity between training and test sets and between query sequences and templates, exceeds 82%, outperforming its ab initio counterpart, other state-of-the-art secondary structure predictors (Jpred 3 and PSIPRED), and two other systems based on PSI-BLAST and COMPASS templates. We show that structural information from homologues improves prediction accuracy well beyond the Twilight Zone of sequence similarity, even below 5% sequence identity, for all four structural properties. Significant improvement over the extraction of structural information directly from PDB templates suggests that the combination of sequence and template information is more informative than templates alone.
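
To make the idea of compressing aligned templates into a structural frequency profile concrete, here is a minimal Python sketch. It is not the authors' SAMD method or their actual input encoding, which the abstract does not specify. The function name `structural_frequency_profile`, the three-class H/E/C alphabet, the `-` gap symbol, and the optional per-template weights are all assumptions made for illustration.

```python
# Hypothetical illustration only: the alphabet, gap symbol, and weighting
# scheme are assumptions, not details taken from the paper.
SS_CLASSES = ("H", "E", "C")  # helix, strand, coil


def structural_frequency_profile(aligned_templates, weights=None):
    """Return one [P(H), P(E), P(C)] vector per alignment column.

    aligned_templates: equal-length strings of secondary-structure labels
        already aligned to the query, with '-' where a template has a gap.
    weights: optional per-template weights (for example, a similarity
        score); defaults to 1.0 for every template.
    """
    if not aligned_templates:
        return []
    length = len(aligned_templates[0])
    if any(len(t) != length for t in aligned_templates):
        raise ValueError("all aligned templates must have the same length")
    if weights is None:
        weights = [1.0] * len(aligned_templates)
    if len(weights) != len(aligned_templates):
        raise ValueError("need exactly one weight per template")

    profile = []
    for pos in range(length):
        totals = {c: 0.0 for c in SS_CLASSES}
        for template, weight in zip(aligned_templates, weights):
            label = template[pos]
            if label in totals:  # gaps and unknown labels are skipped
                totals[label] += weight
        covered = sum(totals.values())
        if covered > 0:
            profile.append([totals[c] / covered for c in SS_CLASSES])
        else:
            # No template covers this column: emit an all-zero vector.
            profile.append([0.0] * len(SS_CLASSES))
    return profile


if __name__ == "__main__":
    for column in structural_frequency_profile(["HHE-", "HEEC", "HHC-"]):
        print(column)
```

In a setup like the one the abstract describes, vectors of this kind would be supplied per residue alongside the sequence-derived inputs; an all-zero column is one simple way to signal that no template covers that position.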

Similar resources

Sequence representation and prediction of protein secondary structure for structural motifs in twilight zone proteins.

Characterizing and classifying regularities in protein structure is an important element in uncovering the mechanisms that regulate protein structure, function and evolution. Recent research concentrates on analysis of structural motifs that can be used to describe larger, fold-sized structures based on homologous primary sequences. At the same time, accuracy of secondary protein structure pred...

The Optimization of Forecasting ATMs Cash Demand of Iran Banking Network Using LSTM Deep Recursive Neural Network

One of the problems of the banking system is cash demand forecasting for ATMs (automated teller machines). Correct prediction can improve the profitability of the banking system for the following reasons, and it will satisfy the customers of this banking system. Accuracy in this prediction is the main goal of this research. If an ATM faces a shortage of cash, it will face the decline of bank...

PREDICTION OF NONLINEAR TIME HISTORY DEFLECTION OF SCALLOP DOMES BY NEURAL NETWORKS

This study deals with predicting the nonlinear time history deflection of scallop domes subject to earthquake loading using a neural network technique. Scallop domes have alternating ridges and grooves that radiate from the centre. There are two main types of scallop domes, lattice and continuous, of which the lattice type is considered in the present paper. Due to the large number o...

Neural Prediction of Buckling Capacity of Stiffened Cylindrical Shells

Estimation of the nonlinear buckling capacity of thin-walled shells is one of the most important aspects of structural mechanics. In this study, the axial buckling loads of 132 stiffened shells were numerically calculated. The applicability of artificial neural networks (ANN) in predicting the buckling capacity of vertically stiffened shells was studied. To this end, feed-forward (FF) multi-layer ...

Prediction of protein structural class for the twilight zone sequences.

Structural class characterizes the overall folding type of a protein or its domain. This paper develops an accurate method for in silico prediction of structural classes from low-homology (twilight zone) protein sequences. The proposed LLSC-PRED method applies a linear logistic regression classifier and a custom-designed, feature-based sequence representation to provide predictions. The main adva...

Journal title:
  • Proteins

Volume 77, Issue 1

Pages: -

Publication date: 2009